Bioinformatics A Practical Guide to Next Generation Sequencing Data Analysis (Hamid D. Ismail)

RNA-Seq Data Analysis ◾ 211

The differential expression of a gene is described by the log fold change of its transcript
abundance relative to a reference. The reference is a different gene in an intra-sample
comparison, or the same gene in a different sample in an inter-sample differential analysis.
The results are reported as the fold change, which gives the size and direction of the change,
together with the test statistic, p-value, and adjusted p-value or FDR, which indicate its
statistical significance. Because many genes are tested at once, the adjusted p-value should
be used to account for multiple comparisons. The genes are usually sorted by p-value, from
lowest to highest, so that the most significant genes are easy to identify; the sign of the
log fold change then separates the upregulated genes from the downregulated ones. These genes
can be further analyzed by annotating them with GO terms and KEGG pathways to gain insight
into their biological roles under the conditions studied.
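The reporting steps summarized above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method of any particular package: the gene names, counts, and raw p-values are hypothetical toy values, the pseudocount of 1 is an assumption made to avoid division by zero, the Benjamini-Hochberg procedure is assumed as the multiple-comparison adjustment, and the function names (`log2_fold_change`, `benjamini_hochberg`, `rank_genes`) are invented for this example. In practice the raw p-values would come from a statistical test run by a dedicated differential expression tool.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of mean abundance in treated vs. reference samples.

    The pseudocount is an illustrative assumption to avoid division by zero.
    """
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def rank_genes(records, alpha=0.05):
    """Attach log2 fold change and adjusted p-value, then sort by adjusted p-value."""
    padj = benjamini_hochberg([rec["pvalue"] for rec in records])
    table = []
    for rec, q in zip(records, padj):
        lfc = log2_fold_change(rec["treated"], rec["control"])
        if q >= alpha:
            call = "not significant"
        else:
            # The sign of the log fold change gives the direction.
            call = "up" if lfc > 0 else "down"
        table.append({"gene": rec["gene"], "log2fc": lfc,
                      "pvalue": rec["pvalue"], "padj": q, "call": call})
    return sorted(table, key=lambda row: row["padj"])


# Hypothetical toy input: normalized counts per sample and a raw p-value
# assumed to have been produced by an upstream statistical test.
records = [
    {"gene": "geneA", "treated": [30, 31, 32], "control": [6, 7, 8], "pvalue": 0.01},
    {"gene": "geneB", "treated": [5, 6, 7], "control": [24, 26, 28], "pvalue": 0.002},
    {"gene": "geneC", "treated": [50, 52, 48], "control": [49, 51, 50], "pvalue": 0.60},
    {"gene": "geneD", "treated": [12, 15, 18], "control": [9, 10, 11], "pvalue": 0.04},
]

for row in rank_genes(records):
    print(f'{row["gene"]}\t{row["log2fc"]:+.2f}\t{row["padj"]:.3f}\t{row["call"]}')
```

In this toy data, one gene has a raw p-value below 0.05 but is no longer called significant after adjustment, which is why the adjusted p-value rather than the raw p-value is used when many genes are tested together.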
